ABSTRACT

Background 

Respiratory failure is a key feature of severe Covid-19 and a critical driver of mortality, but 
for reasons poorly defined affects less than 10% of SARS-CoV-2 infected patients. 

Methods 

We included 1,980 patients with Covid-19 respiratory failure at seven centers in the Italian 
and Spanish epicenters of the SARS-CoV-2 pandemic in Europe (Milan, Monza, Madrid, San 
Sebastian and Barcelona) for a genome-wide association analysis. After quality control and 
exclusion of population outliers, 835 patients and 1,255 population-derived controls from 
Italy, and 775 patients and 950 controls from Spain were included in the final analysis. In 
total we analyzed 8,582,968 single-nucleotide polymorphisms (SNPs) and conducted a
meta-analysis of both case-control panels.

Results 

We detected cross-replicating associations with rs11385942 at chromosome 3p21.31 and
rs657152 at 9q34, which were genome-wide significant (P<5×10⁻⁸) in the meta-analysis of
both study panels: odds ratio [OR], 1.77; 95% confidence interval [CI], 1.48 to 2.11;
P=1.14×10⁻¹⁰ and OR 1.32 (95% CI, 1.20 to 1.47; P=4.95×10⁻⁸), respectively. Among six
genes at 3p21.31, SLC6A20 encodes a known interaction partner with angiotensin converting
enzyme 2 (ACE2). The association signal at 9q34 was located at the ABO blood group locus
and a blood-group-specific analysis showed higher risk for A-positive individuals (OR=1.45,
95% CI, 1.20 to 1.75, P=1.48×10⁻⁴) and a protective effect for blood group O (OR=0.65, 95%
CI, 0.53 to 0.79, P=1.06×10⁻⁵).

Conclusions 

We herein report the first robust genetic susceptibility loci for the development of respiratory 
failure in Covid-19. Identified variants may help guide targeted exploration of severe
Covid-19 pathophysiology.
Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was discovered in Wuhan,
China, in late 2019 and rapidly evolved into a global pandemic. 1 As of May 28th 2020, there are
over 5.1 million confirmed cases worldwide, with total deaths exceeding 355,000 (source:
Johns Hopkins). In Europe, Italy and Spain were severely affected early, with epidemic peaks
starting in the second half of February 2020 (Figure 1) and 60,189 fatal cases reported by
May 28th 2020. 2 Coronavirus disease 2019 (Covid-19) has variable behavior, with the vast
majority of infected individuals experiencing only mild or even no symptoms. 3 Mortality rates 
are predominantly driven by the subset of patients developing severe respiratory failure 
secondary to bilateral interstitial pneumonia and acute respiratory distress syndrome. 4 Severe 
Covid-19 with respiratory failure requires early and prolonged support by mechanical 
ventilation. 5 

The pathogenesis of respiratory failure in Covid-19 is poorly understood, but mortality
consistently associates with older age and male gender. 6-8 Clinical associations have also been
reported for obesity and cardiovascular disease traits, hypertension and diabetes in particular, 
but the relative role of these risk factors in determining Covid-19 severity has not yet been 
clarified. 6 9 Observations on lymphocytic endothelitis and diffuse microvascular and 
macrovascular thromboembolic complications may suggest that Covid-19 is a systemic 
disease that primarily injures the vascular endothelium, but provide mostly hypothetical 
insights into the underlying pathogenesis in severe Covid-19. 10-12 On this background, at the
peak of the epidemic in Italy and Spain, we performed a genome-wide association study
(GWAS) to possibly delineate host genetic factors contributing to respiratory failure in
Covid-19. The relatively low Covid-19 disease burden in Norway and Germany allowed for a
complementary team to be set up, whereby rapid analysis could occur in parallel with rapid 
patient recruitment in the affected Italian and Spanish epicenters. 
Materials and methods

Study subjects / recruitment 

We recruited in total 1,980 patients with severe Covid-19 infection, defined by hospitalization
with respiratory failure and confirmed SARS-CoV-2 viral replication from nasopharyngeal
swabs or other relevant biological fluids, cross-sectionally from intensive care units and
general wards of seven hospitals in five cities in the pandemic epicenters in Italy and Spain
(Table 1 and Supplementary Table 1A): Fondazione IRCCS Ca' Granda Ospedale Maggiore
Policlinico, Milan (n=597); Humanitas Clinical and Research Center, IRCCS, Milan (n=154);
San Gerardo Hospital School of Medicine, Monza (n=200); Hospital Clinic and IDIBAPS,
Barcelona (n=56); Hospital Universitario Vall d'Hebron, Barcelona (n=337); Hospital
Universitario Ramon y Cajal, Madrid (n=298); and Donostia University Hospital, San Sebastian
(n=338). Respiratory failure was defined in the simplest possible manner, to ensure
feasibility, by requirement of oxygen supplementation or mechanical ventilation, with the 
severity graded according to maximum respiratory support up until time of blood sampling 
(oxygen therapy only, non-invasive ventilatory support, invasive ventilatory support, 
extracorporeal membrane oxygenation). Severity was also binarized to no mechanical 
oxygenation vs. mechanical oxygenation for severity assessments. Whole blood or buffy coats 
from diagnostic venipuncture were collected for DNA extraction. 

For comparison, we included a total of 2,381 controls from Italy and Spain (Supplementary
Table 1B). We recruited 998 randomly selected blood donors at Fondazione IRCCS Ca'
Granda Ospedale Maggiore Policlinico, Milan, with no evidence of Covid-19, who were
genotyped for the purpose of the present study. We also included two control panels with
genotype data derived from previous studies using the same genotyping array: from Italy,
n=396 controls from reference 13, and from Spain, n=987 controls recruited from blood donors
(San Sebastian).

Ethical approval 

The project protocol outlined a rapid patient inclusion with principally no additional project- 
related procedures (material from clinically indicated venipunctures) and with the opportunity 
of complete anonymity with only minimal data collected. Differences in recruitment and 
consent procedures between centers were determined by 1) some centers integrating the 
project in larger Covid-19 biobanking efforts and others doing dedicated inclusion for this 
project and 2) variability regarding the local ethical committee handling of anonymization vs. 
deidentification as well as consent procedures. Written informed consent was obtained from
all study subjects at each center when possible; alternatively, delayed consent, oral consent
or consent via next of kin was collected, depending on local ethical committee regulations.
For some severely ill patients, where this was not possible, an exemption from informed
consent was granted by the local ethical committee or per local regulations during the
Covid-19 pandemic to allow the use of completely anonymized surplus material from
diagnostic venipuncture.

The following ethical approvals of the project were obtained from the relevant ethics 
committees: Germany: Kiel (reference number D464/20); Italy: Fondazione IRCCS Ca'
Granda Ospedale Maggiore Policlinico (reference numbers 342_2020 and 334-2020 for cases
and controls, respectively), Humanitas Clinical and Research Center, IRCCS (reference
number 316/20), San Gerardo Hospital School of Medicine, Monza (the ethics committee of
the National Institute of Infectious Diseases Lazzaro Spallanzani, reference number 84/2020);
Norway: Regional Committee for Medical and Health Research Ethics in South-Eastern
Norway (reference number 132550); Spain: Hospital Clinic, Barcelona (reference number
HCB/2020/0405), Hospital Universitario Vall d'Hebron, Barcelona (reference number
PR(AG)244/2020), Hospital Universitario Ramon y Cajal, Madrid (reference number 093/20)
and Donostia University Hospital, San Sebastian (reference number PI2020064).

Sample processing, genotyping and imputation 

We performed DNA extraction from all 1,980 cases and 1,394 Italian controls using a 
Chemagic 360 from PerkinElmer (Waltham, Massachusetts, U.S.) using the low volume kit 
cmg 1491 and the Buffy coat kit cmg-714 (Chemagen, Baesweiler, Germany), respectively. 

For genotyping, we used Illumina's Global Screening Array (GSA) version 2.0 (GSA
v2.0; Illumina Inc., San Diego, USA), which contains 712,189 variants before quality control
(QC). Details on genotyping and QC procedures can be found in the Supplementary
Methods. To maximize genetic coverage, we performed single-nucleotide polymorphism (SNP)
imputation on genome build GRCh38 using the Michigan Imputation Server and 194,512 haplotypes
generated by the Trans-Omics for Precision Medicine (TOPMed) program (Freeze 5). 14 After
excluding samples during QC (the majority of which were population outliers, see
Supplementary Methods and Table 1B), the final case-control datasets comprised 835
patients and 1,255 population-derived controls from Italy, and 775 patients and 950 controls
from Spain, with a total of 8,965,091 SNPs included for the Italian cohort and 9,140,716
SNPs for the Spanish cohort.

Statistical methods 

To take imputation uncertainty into account, we tested for phenotypic associations with allele
dosage data separately for both Italian and Spanish case-control panels through the use of
PLINK's logistic regression framework for dosage data (PLINK v1.9). 15 Two adjusted
association analyses including covariates from principal component analysis (PCA) were
conducted (analysis I and II) to control for (I) potential population stratification and (II)
potential population stratification as well as age and gender bias. A fixed-effects
meta-analysis was conducted using the meta-analysis tool METAL 16 on variants overlapping
between both studies using the BETA and its standard error (SE) from the study-specific
association analyses. We used the commonly accepted threshold of 5×10⁻⁸ for joint P-values
to define statistical significance.
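
As an illustration of the fixed-effects step, the following minimal Python sketch combines per-panel effect estimates by inverse-variance weighting, the kind of standard-error-based scheme that METAL provides. It is not the authors' pipeline: the function names are hypothetical, and the inputs are back-calculated from the rounded rs657152 odds ratios and confidence intervals in Table 2 rather than taken from the actual BETA and SE output, so the result can only approximate the reported meta-analysis values.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def beta_se_from_or(odds_ratio, ci_low, ci_high):
    """Approximate the log-odds (BETA) and its SE from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def fixed_effects_meta(estimates):
    """Inverse-variance weighted fixed-effects meta-analysis.

    estimates: list of (beta, se) tuples, one per study panel.
    Returns (beta_meta, se_meta, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P-value
    return beta, se, z, p


# Rounded rs657152 values from Table 2 (analysis I), used only as an example.
italy = beta_se_from_or(1.37, 1.20, 1.57)
spain = beta_se_from_or(1.26, 1.08, 1.48)
beta_meta, se_meta, z_meta, p_meta = fixed_effects_meta([italy, spain])
print(f"OR_meta = {math.exp(beta_meta):.2f}, P = {p_meta:.2e}")
```

With these rounded inputs the combined odds ratio comes out at about 1.32, in line with Table 2; the P-value differs somewhat from the reported one because the inputs are rounded.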

Based on results from TOPMed genotype imputation, we utilized three ABO SNPs 
(rs8176747, rs41302905 and rs8176719) 17,18 to infer ABO blood type and calculated blood 
group specific odds ratios according to A vs. B/AB/O, B vs. A/AB/O, AB vs. A/B/O and O 
vs. A/AB/B (see Supplementary Methods). To assess in detail the HLA complex at 6p21, at
which we found no genome-wide significant associations in the main analysis, we performed
sequencing-based HLA typing of 7 HLA loci (HLA-A, -C, -B, -DRB1, -DQA1, -DQB1,
-DPB1) in the subset of 916 cases from Italy and 1,087 cases from Spain (see Supplementary
Methods), and assessed allelic distribution according to no mechanical ventilation (oxygen
supplementation only) vs. mechanical ventilation of any type (Table 1). A similar assessment
was made for lead SNPs rs11385942 and rs657152, and at these broader loci (3p21.31 and
9q34.2) we also performed Bayesian fine-mapping analysis (see Supplementary Methods).
Results

The milestones of the study in the context of the peak outbreaks in Italy and Spain are shown 
in Figure 1. Age, gender and maximum respiratory support up until time of blood sampling 
for patients included in the final analysis are given in Table 1 and Supplementary Table 1. 
By utilizing GSA-only data, we were able to perform a uniform quality control of merged 
Italian and Spanish batches, thus reducing potential batch effects, and conducted Italian and 
Spanish association analyses (see Supplementary Methods and Supplementary Figure 2 
and 3). Quantile-quantile (Q-Q) plots of the two meta-analyses of Spanish and Italian
association results revealed significant associations in the tail of the distribution with minimal
genomic inflation (λGC=1.015 and λGC=1.006, respectively; Supplementary Figure 4).
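
For readers unfamiliar with the genomic inflation factor, the sketch below shows one common way to compute λGC from a list of association P-values: the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null (about 0.455). This is a generic illustration, not the authors' code; the function name and the evenly spaced toy P-values are assumptions made for the example.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Estimate lambda_GC from two-sided association P-values.

    Each P-value is converted to a 1-df chi-square statistic (the squared
    normal quantile); lambda_GC is the ratio of the observed median to the
    median expected under the null hypothesis.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.75) ** 2  # about 0.455 for 1 df
    return median(chi2) / expected_median


# Toy input: evenly spaced P-values mimicking a null (uniform) distribution.
n = 1001
null_p = [(i + 0.5) / n for i in range(n)]
lam = genomic_inflation(null_p)
print(round(lam, 3))
```

Values close to 1, such as those reported above, indicate little residual inflation of the test statistics.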

We found two loci to be associated with Covid-19 induced respiratory failure with
genome-wide significance (P<5×10⁻⁸) in the meta-analysis (analysis I) (Figure 2 and Table 2): the
rs11385942 insertion-deletion GA/A SNP at chromosome 3p21.31, OR_meta 1.77 (95% CI,
1.48 to 2.11), P=1.14×10⁻¹⁰, and the rs657152 A/C SNP at 9q34.2, OR_meta 1.32 (95% CI, 1.20
to 1.47), P=4.95×10⁻⁸. Both loci showed nominally significant association in both the Spanish
and Italian sub-analyses (Table 2). Manual inspection of genotype cluster plots of genotyped
SNPs in associated regions showed distinct genotype clouds for homozygous and
heterozygous calls. Furthermore, an age and gender corrected analysis (analysis II)
corroborated observations at both rs11385942, OR_meta 2.11 (95% CI, 1.70 to 2.61), P=9.46×10⁻¹²,
and rs657152, OR_meta 1.39 (95% CI, 1.22 to 1.59), P=5.35×10⁻⁷ (Table 2; Supplementary
Figure 5). A further 24 genomic loci showed suggestive evidence (P<1×10⁻⁵) for association
with Covid-19 induced respiratory failure in analysis I (Supplementary Table 2;
Supplementary Figure 6).
Association signals at 3p21.31 and 9q34.2 were fine-mapped to 22 and 38 variants,
respectively, with greater than 95% certainty (Figure 3A and 3B and Supplementary Table
3). The association signal at 3p21.31 comprised six genes (Figure 3A and Table 2). We
found that the frequency of the risk allele GA of the lead SNP, rs11385942, was higher in
patients with mechanical ventilation compared with those receiving oxygen supplementation
only, with OR=1.70 (95% CI, 1.27 to 2.26; P=0.00033) in the unadjusted analysis
(Supplementary Table 4). Available database entries suggest that the frequency of the risk
allele of rs11385942 varies between populations world-wide and that the variant is
monomorphic in China (Supplementary Figure 7).
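
The idea of a 95% credible set from Bayesian fine-mapping can be sketched as follows. The authors' actual procedure is described in their Supplementary Methods and is not reproduced here; this example instead assumes a single causal variant per locus and uses Wakefield's approximate Bayes factor with a prior effect-size standard deviation of 0.2. The function names, the prior and the three toy variants are all hypothetical.

```python
import math


def approx_bayes_factor(beta, se, prior_sd=0.2):
    """Wakefield's approximate Bayes factor in favour of association."""
    v = se ** 2          # sampling variance of the effect estimate
    w = prior_sd ** 2    # assumed prior variance of the true effect
    z2 = (beta / se) ** 2
    return math.sqrt(v / (v + w)) * math.exp(0.5 * z2 * w / (v + w))


def credible_set(variants, coverage=0.95):
    """Smallest set of variants whose posterior probabilities reach `coverage`.

    variants: list of (variant_id, beta, se); assumes one causal variant.
    Returns a list of (variant_id, posterior) sorted by decreasing posterior.
    """
    abfs = [(vid, approx_bayes_factor(b, s)) for vid, b, s in variants]
    total = sum(abf for _, abf in abfs)
    ranked = sorted(((abf / total, vid) for vid, abf in abfs), reverse=True)
    chosen, cumulative = [], 0.0
    for posterior, vid in ranked:
        chosen.append((vid, posterior))
        cumulative += posterior
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical summary statistics for three variants at one locus.
toy_locus = [
    ("variant_1", 0.50, 0.08),
    ("variant_2", 0.10, 0.08),
    ("variant_3", 0.00, 0.08),
]
toy_set = credible_set(toy_locus)
print([vid for vid, _ in toy_set])
```

In real data, linkage disequilibrium spreads the posterior over many correlated variants, which is why credible sets of 22 and 38 variants are plausible at these loci.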

At 9q34.2 the association signal was restricted to the ABO blood group gene (Figure 3B and 
Supplementary Figure 8). Accordingly, the distribution of ABO blood groups, as predicted 
from combinations of genotypes of three different SNPs was skewed in Covid-19 patients 
with respiratory failure compared with controls (Supplementary Table 5), with higher risk 
for A-positive individuals (meta-analysis result: OR=1.45, 95% CI, 1.20 to 1.75, P=1.48×10⁻⁴)
and a protective effect for blood group O (meta-analysis result: OR=0.65, 95% CI, 0.53 to
0.79, P=1.06×10⁻⁵; see Supplementary Table 5 for details). Both associations and effect
directions were consistent in the separate Spain-only and Italian-only case-control analysis 
(see Supplementary Table 5). We found no difference in blood group distribution between 
patients receiving oxygen supplementation only vs. mechanical ventilation of any kind 
(Supplementary Table 5). 

Since several viral infections are known to be controlled by genetic variation at the HLA 
complex at chromosome 6p21, we scrutinized the extended HLA region (chr6:25-34Mb; 
Supplementary Figure 9). There were no SNP or allele association signals at the HLA
complex meeting either the genome-wide or the suggestive significance threshold of
P=1×10⁻⁵ (Supplementary Table 6). Furthermore, we found no significant differences in
allelic distribution between patients with oxygen supplementation only and those with
mechanical ventilation of any kind (assessed by direct HLA typing, data not shown).

medRxiv preprint doi: https://doi.org/10.1101/2020.05.31.20114991. This version posted June 2, 2020. The copyright holder for this preprint
(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
All rights reserved. No reuse allowed without permission.
Discussion

Using a pragmatic approach with simplified inclusion criteria and a complementary team of 
clinicians at the European Covid-19 epicenters in Italy and Spain and available German and 
Norwegian scientists, we were able to perform a complete GWAS for Covid-19 respiratory failure 
in about two months. We detected cross-replicating findings at chromosome 3 and chromosome 9, 
which achieved genome-wide significance in meta-analysis of both study panels. 

On chromosome 3p21 the peak association signal covers a cluster of several genes with potentially
relevant functions to severe Covid-19. One notable candidate is SLC6A20, which encodes the
Sodium/Imino-acid (proline) Transporter 1 (SIT1) that functionally interacts with angiotensin
converting enzyme 2 (ACE2), the SARS-CoV-2 cell surface receptor. 19,20 SIT1 expression in the
lungs is mainly present in pneumocytes 21 , where SIT1 should be scrutinized for involvement in
SARS-CoV-2 viral entry. However, the relevant locus also contains a cluster of genes encoding
chemokine receptors, including the CC-motif chemokine receptor 9 (CCR9) and the C-X-C motif
chemokine receptor 6 (CXCR6), the latter of which has been shown to regulate the partitioning of
lung-resident memory CD8 T-cells throughout the sustained immune response to airway pathogens,
including influenza viruses. 22 In the publicly available results from the Covid-19 Host Genetics
Consortium 23 , a similar association has been observed in an analysis of Covid-19 affected cases vs.
a population-based sample, although not at genome-wide significant levels, which still corroborates our
observations. These parallel observations with our analysis, which focused on severe cases with
pulmonary failure only, point to the relevance of ascertainment bias in genetic studies of Covid-19,
as clinically significant Covid-19 patients are more likely to be included in research projects than
asymptomatic cases. The significantly higher frequency of the risk allele at the chromosome 3 locus
found in the present study in patients requiring mechanical ventilation compared with oxygen only
provides further support for a role for this genetic region in modifying Covid-19 severity.
Preliminary clinical reports have suggested the involvement of ABO blood groups in Covid-19
susceptibility (preprints by Zhao et al. 24 and Zietz et al. 25 ). Similar reflections can thus be made for
case ascertainment as for the chromosome 3 locus, and ABO blood groups have also been
implicated in SARS-CoV-1 susceptibility. 26 Our data thus align with the suggestions that blood
group O is associated with lower risk compared with non-O blood groups whereas blood group A is
associated with higher risk of acquiring Covid-19 compared with non-A blood groups. 24,25 Unlike
for chromosome 3, we found no difference between patients receiving oxygen supplementation
only and those with mechanical ventilation of any kind. 25 However, it should be noted that the lead
SNP at the ABO locus in our study (rs657152) has been associated with elevated interleukin-6
(IL-6) levels in childhood obesity in a previous GWAS 27 , providing a hypothetical link to the established
association of elevated IL-6 with severity and mortality of Covid-19. 28 Furthermore, genetic
variation at the ABO locus has previously been associated with a number of procoagulant markers
such as von Willebrand factor and Factor VIII, and the potential relationship between our genetic
findings and the significant coagulopathy that is observed in severe Covid-19 warrants further
attention.

We are fully aware that the pragmatic aspects leading to feasibility of this massive undertaking in a 
very short period of time during extreme clinical circumstances of the pandemic led to certain 
limitations that will be important to explore in follow-up studies. For example, to enable 
recruitment of study participants, a bare minimum of clinical metadata was requested. For this 
reason, extensive genotype-phenotype elaboration of current findings could not be performed, and 
adjustments for all potential sources of bias (e.g. underlying cardiovascular and metabolic factors 
relevant to Covid-19) could not be done. Furthermore, the alignment of our findings with 
preliminary reports assessing Covid-19 susceptibility should lead to a critical debate as to 
phenotype definitions for cases and controls in genetic studies of Covid-19. Also, few restrictions 
during inclusion were made, leading to genotyped samples having to be excluded due to differing
ethnicities (genetic population outliers). That said, we took great care to minimize variability
between cases and controls arising from such sources, as well as variability that could have been
introduced by differences between genotyping platforms 29 , e.g. by limiting our inclusion of
controls to those genotyped on the Illumina Global Screening Array, despite thus reducing our
statistical power.

Further exploration of current findings, both as to their utility in clinical risk profiling of Covid-19 
patients and mechanistic understanding of the underlying pathophysiology, is now warranted. 
Table 1. Overview of patients included in final analysis. Male predominance among patients and
the high age with medians >63 years are consistent among all centres. Sample numbers provided
are post-QC for the GWAS (see also Supplementary Table 1C).

Centres:
IT.A: Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan
IT.B: Humanitas Clinical and Research Center, IRCCS, Milan
IT.C: San Gerardo Hospital School of Medicine, Monza
ES.A: Hospital Clinic and IDIBAPS, Barcelona
ES.B: Hospital Universitario Vall d'Hebron, Barcelona
ES.C: Hospital Universitario Ramon y Cajal, Madrid
ES.D: Donostia University Hospital, San Sebastian

                            IT.A        IT.B        IT.C        ES.A        ES.B        ES.C        ES.D
N                           503         140         192         45          228         201         301
Age, median (IQR)           64 (54-76)  67 (57-75)  66 (56-74)  69 (59-75)  65 (56-72)  69 (60-79)  67 (57-75)
Gender female, n (%)        159 (32)    39 (28)     51 (27)     13 (29)     78 (34)     50 (25)     124 (41)
Respiratory support, n (%)
- Oxygen only               0 (0)       70 (50)     67 (35)     7 (16)      105 (46)    106 (53)    255 (85)
- Non-invasive ventilation  399 (79)    25 (18)     89 (46)     6 (13)      7 (3)       16 (8)      0 (0)
- Ventilator                104 (21)    45 (32)     33 (17)     31 (69)     116 (51)    77 (38)     46 (15)
- ECMO                      0 (0)       0 (0)       3 (2)       1 (2)       0 (0)       2 (1)       0 (0)

ECMO: Extracorporeal membrane oxygenation; IQR: interquartile range
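
The binarized severity grouping used in the analyses (oxygen supplementation only vs. mechanical ventilation of any type) can be tallied directly from the Table 1 counts. The short Python sketch below does this and checks that the per-centre categories add up to the post-QC sample sizes; the dictionary layout and function name are illustrative choices, and grouping non-invasive ventilation, ventilator and ECMO together as "mechanical" follows the wording of the Methods.

```python
# Respiratory support counts per centre, copied from Table 1 (post-QC).
support = {
    "IT.A": {"n": 503, "oxygen_only": 0, "non_invasive": 399, "ventilator": 104, "ecmo": 0},
    "IT.B": {"n": 140, "oxygen_only": 70, "non_invasive": 25, "ventilator": 45, "ecmo": 0},
    "IT.C": {"n": 192, "oxygen_only": 67, "non_invasive": 89, "ventilator": 33, "ecmo": 3},
    "ES.A": {"n": 45, "oxygen_only": 7, "non_invasive": 6, "ventilator": 31, "ecmo": 1},
    "ES.B": {"n": 228, "oxygen_only": 105, "non_invasive": 7, "ventilator": 116, "ecmo": 0},
    "ES.C": {"n": 201, "oxygen_only": 106, "non_invasive": 16, "ventilator": 77, "ecmo": 2},
    "ES.D": {"n": 301, "oxygen_only": 255, "non_invasive": 0, "ventilator": 46, "ecmo": 0},
}


def binarize(counts):
    """Return (oxygen only, mechanical ventilation of any type) for one centre."""
    mechanical = counts["non_invasive"] + counts["ventilator"] + counts["ecmo"]
    return counts["oxygen_only"], mechanical


# Each centre's categories should sum to its post-QC sample size.
for centre, counts in support.items():
    assert sum(binarize(counts)) == counts["n"], centre

oxygen_total = sum(binarize(c)[0] for c in support.values())
mechanical_total = sum(binarize(c)[1] for c in support.values())
print(oxygen_total, mechanical_total)  # 610 1000
```

The two groups sum to the 1,610 cases included in the final meta-analysis.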
Table 2. Susceptibility loci associated with severe Covid-19 with respiratory failure.

Chr       Association boundaries (bp)   dbSNP id     A1   A2   Key genes
3p21.31   chr3:45800446-46135604        rs11385942   GA   G    SLC6A20, LZTFL1, FYCO1, CXCR6, XCR1, CCR9
9q34.2    chr9:133257521-133279871      rs657152     A    C    ABO

Chr       Analysis*            Panel (cases / controls)      P            OR (95% CI)        A1F cases  A1F ctrs
3p21.31   main                 Meta-analysis (1610 / 2205)   1.14×10⁻¹⁰   1.77 (1.48-2.11)
3p21.31   main                 Italy (835 / 1255)            1.98×10⁻⁷    1.74 (1.27-2.38)   0.14       0.09
3p21.31   main                 Spain (775 / 950)             1.32×10⁻⁴    1.85 (1.50-2.28)   0.09       0.05
3p21.31   corr. age, gender*   Meta-analysis (1610 / 2205)   9.46×10⁻¹²   2.11 (1.70-2.61)
3p21.31   corr. age, gender*   Italy (835 / 1255)            7.02×10⁻⁸    1.95 (1.53-2.48)   0.14       0.09
3p21.31   corr. age, gender*   Spain (775 / 950)             1.17×10⁻⁵    2.79 (1.76-4.42)   0.09       0.05
9q34.2    main                 Meta-analysis (1610 / 2205)   4.95×10⁻⁸    1.32 (1.20-1.47)
9q34.2    main                 Italy (835 / 1255)            2.90×10⁻⁶    1.37 (1.20-1.57)   0.42       0.35
9q34.2    main                 Spain (775 / 950)             3.55×10⁻³    1.26 (1.08-1.48)   0.42       0.35
9q34.2    corr. age, gender*   Meta-analysis (1610 / 2205)   5.35×10⁻⁷    1.39 (1.22-1.59)
9q34.2    corr. age, gender*   Italy (835 / 1255)            5.31×10⁻⁵    1.37 (1.17-1.60)   0.42       0.35
9q34.2    corr. age, gender*   Spain (775 / 950)             2.81×10⁻³    1.45 (1.13-1.84)   0.42       0.35

All association test statistics were adjusted for the top 10 principal components from principal component analysis. *Two analyses were performed,
“main”, only correcting for principal components, and “corr. age, gender”, correcting for age and gender in addition to 10 principal components. In the
corrected analysis, 25 controls are excluded from the Spanish and meta-analysis due to missing covariate data.
Chr: chromosome of marker; Pos: Genomic positions were retrieved from NCBI’s dbSNP build v153 (genome build hg38); Association boundaries:
association boundaries for each index SNP (see Methods); dbSNP id: rs ID (rs11385942 is annotated as chr3:45834968-45834969:AAA:AA in
dbSNP v153 and as chr3:45834967:GA:G in TOPMed imputation reference panel); A1: minor and risk allele; A2: major allele; Key gene(s): candidate
gene(s) in the region; P/OR: P-value and corresponding odds ratio and 95% confidence interval with respect to minor allele. For each panel, numbers
of cases/controls are displayed in parentheses; A1F_cases: allele frequency of minor/risk allele 1 in cases; A1F_ctrs: allele frequency of minor/risk
allele 1 in controls.
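
As a rough plausibility check on Table 2, a crude allelic odds ratio can be derived from the case and control allele frequencies alone. The Python sketch below does this for the rounded frequencies in the table; the function name is illustrative, and the result is an unadjusted approximation that is not expected to match the reported odds ratios exactly, since those come from dosage-based logistic regression adjusted for principal components.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Unadjusted allelic OR from allele frequencies in cases and controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Rounded A1 frequencies (cases, controls) as listed in Table 2.
table2_frequencies = {
    ("rs11385942", "Italy"): (0.14, 0.09),
    ("rs11385942", "Spain"): (0.09, 0.05),
    ("rs657152", "Italy"): (0.42, 0.35),
    ("rs657152", "Spain"): (0.42, 0.35),
}

crude_or = {
    key: allelic_odds_ratio(cases, controls)
    for key, (cases, controls) in table2_frequencies.items()
}
for (snp, panel), value in crude_or.items():
    print(f"{snp} {panel}: crude OR = {value:.2f}")
```

The crude values fall in the same range as the adjusted estimates, with the risk allele enriched in cases in every panel.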
Figure Legends

Figure 1. Timeline of rapid Covid-19 genome-wide association study (GWAS). The main events 
and milestones of the study are summarized in the plot along time (x-axis, weeks (W) are noted). 
Patient samples from three Italian and four Spanish hospitals were collected around the peak of the 
local epidemics and ethics applications were quickly obtained through fast-track procedures, i.e. 
every local ethical review board supported Covid-19 studies by rapid turn-around times, facilitating 
this fast de novo data generation. Within 6 weeks, all collected blood samples were centrally 
isolated, genotyped and analysed. The rapid workflow from patients to target identification 
illustrates the utility of GWAS, a standardized tool in research that often relies on international and 
interdisciplinary cooperation. One centre alone could not have completed this study, not to mention
the increase in statistical power through multi-centre patient contribution. Speed of data production
depended heavily on lab automation, and speed of analyses reflects existing analytical pipelines and
generous support of public so-called “imputation servers” (here, the Michigan imputation server of
the Abecasis group).


Figure 2. Genome-wide analysis summary (Manhattan) plot of the meta-analysis association
statistics highlighting two susceptibility loci with genome-wide significance (P<5×10⁻⁸) for
severe Covid-19 with respiratory failure. Manhattan plot of the association statistics from the
meta-analysis controlled for potential population stratification (strategy I). The red horizontal line
indicates the genome-wide significance threshold of P=5×10⁻⁸. Supplementary Figure 5 shows
Manhattan plots including also hits passing a suggestive significance threshold of P<1×10⁻⁵ (total of
24 additional suggestive genomic loci; see also Methods and Supplementary Table 3).


Figure 3. Regional association plots of susceptibility loci associated with severe Covid-19 with 
respiratory failure. Bayesian fine-mapping analysis (see Methods) prioritized 22 and 38 variants 
for loci 3p21.31 and 9q34.2, respectively, with greater than 95% certainty. LD values were 
calculated based on genotypes of the merged Italian/Spanish dataset derived from TOPMed 
imputation (see Methods). hg38 positions are plotted. 